1.
Nucleic Acids Res ; 50(D1): D497-D508, 2022 01 07.
Article in English | MEDLINE | ID: covidwho-2232151

ABSTRACT

Almost twenty years after its initial release, the Eukaryotic Linear Motif (ELM) resource remains an invaluable source of information for the study of motif-mediated protein-protein interactions. ELM provides a comprehensive, regularly updated and well-organised repository of manually curated, experimentally validated short linear motifs (SLiMs). An increasing number of SLiM-mediated interactions are discovered each year and keeping the resource up-to-date continues to be a great challenge. In the current update, 30 novel motif classes have been added and five existing classes have undergone major revisions. The update includes 411 new motif instances mostly focused on cell-cycle regulation, control of the actin cytoskeleton, membrane remodelling and vesicle trafficking pathways, liquid-liquid phase separation and integrin signalling. Many of the newly annotated motif-mediated interactions are targets of pathogenic motif mimicry by viral, bacterial or eukaryotic pathogens, providing invaluable insights into the molecular mechanisms underlying infectious diseases. The current ELM release includes 317 motif classes incorporating 3934 individual motif instances manually curated from 3867 scientific publications. ELM is available at: http://elm.eu.org.


Subject(s)
Communicable Diseases/genetics , Databases, Protein , Host-Pathogen Interactions/genetics , Protein Interaction Domains and Motifs , Software , Actin Cytoskeleton/chemistry , Actin Cytoskeleton/metabolism , Animals , Binding Sites , Cell Cycle/genetics , Cell Membrane/chemistry , Cell Membrane/metabolism , Communicable Diseases/metabolism , Communicable Diseases/virology , Cyclins/chemistry , Cyclins/genetics , Cyclins/metabolism , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Eukaryotic Cells/virology , Gene Expression Regulation , Humans , Integrins/chemistry , Integrins/genetics , Integrins/metabolism , Mice , Molecular Sequence Annotation , Protein Binding , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Signal Transduction , Transport Vesicles/chemistry , Transport Vesicles/metabolism , Viruses/genetics , Viruses/metabolism
2.
Int J Mol Sci ; 23(2), 2022 Jan 14.
Article in English | MEDLINE | ID: covidwho-1633064

ABSTRACT

Peripheral blood mononuclear cells (PBMCs) belong to the innate and adaptive immune system and are highly sensitive and responsive to changes in their systemic environment. In this study, we focused on the time course of transcriptional changes in freshly isolated human PBMCs 4, 8, 24 and 48 h after onset of stimulation with the active vitamin D metabolite 1α,25-dihydroxyvitamin D3 (1,25(OH)2D3). Taking all four time points together, 662 target genes were identified and segregated either by time of differential gene expression into 179 primary and 483 secondary targets or by driver of expression change into 293 direct and 369 indirect targets. The latter classification revealed that more than 50% of target genes were primarily driven by the cells' response to ex vivo exposure rather than by the nuclear hormone and largely explained its down-regulatory effect. Functional analysis indicated vitamin D's role in the suppression of the inflammatory and adaptive immune response by down-regulating ten major histocompatibility complex class II genes, five alarmins of the S100 calcium binding protein A family and by affecting six chemokines of the C-X-C motif ligand family. Taken together, studying time-resolved responses allows the effects of vitamin D on the immune system to be better contextualized.


Subject(s)
Adaptive Immunity/genetics , Gene Expression Profiling , Gene Expression Regulation , Inflammation Mediators/metabolism , Transcriptome , Vitamin D/metabolism , Computational Biology/methods , Gene Expression Profiling/methods , Gene Expression Regulation/drug effects , Humans , Inflammation/etiology , Inflammation/metabolism , Inflammation/pathology , Leukocytes, Mononuclear/drug effects , Leukocytes, Mononuclear/immunology , Leukocytes, Mononuclear/metabolism , Molecular Sequence Annotation , Vitamin D/analogs & derivatives , Vitamin D/pharmacology
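
The gene counts quoted in the abstract of entry 2 can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the published numbers, not the authors' analysis; the variable names are illustrative, and reading the "more than 50%" statement as the indirect-target share is an assumption based on the abstract's wording.

```python
# Counts quoted in the abstract of entry 2 (variable names are illustrative).
total_targets = 662
by_timing = {"primary": 179, "secondary": 483}
by_driver = {"direct": 293, "indirect": 369}

# Each classification should partition the same set of target genes.
assert sum(by_timing.values()) == total_targets
assert sum(by_driver.values()) == total_targets


def shares(groups, total):
    """Return each group's share of the total as a percentage (one decimal)."""
    return {name: round(100 * n / total, 1) for name, n in groups.items()}


timing_shares = shares(by_timing, total_targets)
driver_shares = shares(by_driver, total_targets)
print(timing_shares)
print(driver_shares)
```

Under that reading, the indirect targets account for roughly 56% of all target genes, which is consistent with the abstract's "more than 50%".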
3.
Brief Bioinform ; 23(2), 2022 03 10.
Article in English | MEDLINE | ID: covidwho-1639367

ABSTRACT

Genomic epidemiology is important to study the COVID-19 pandemic, and more than two million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic sequences were deposited into public databases. However, the exponential increase of sequences invokes unprecedented bioinformatic challenges. Here, we present the Coronavirus GenBrowser (CGB) based on a highly efficient analysis framework and a node-picking rendering strategy. In total, 1,002,739 high-quality genomic sequences with the transmission-related metadata were analyzed and visualized. The size of the core data file is only 12.20 MB, highly efficient for clean data sharing. Quick visualization modules and rich interactive operations are provided to explore the annotated SARS-CoV-2 evolutionary tree. CGB binary nomenclature is proposed to name each internal lineage. The pre-analyzed data can be filtered out according to the user-defined criteria to explore the transmission of SARS-CoV-2. Different evolutionary analyses can also be easily performed, such as the detection of accelerated evolution and ongoing positive selection. Moreover, the 75 genomic spots conserved in SARS-CoV-2 but non-conserved in other coronaviruses were identified, which may indicate the functional elements specifically important for SARS-CoV-2. The CGB was written in Java and JavaScript. It not only enables users who have no programming skills to analyze millions of genomic sequences, but also offers a panoramic vision of the transmission and evolution of SARS-CoV-2.


Subject(s)
COVID-19/epidemiology , COVID-19/virology , Public Health Surveillance/methods , SARS-CoV-2/genetics , Software , Web Browser , Computational Biology/methods , DNA Mutational Analysis , Databases, Genetic , Genome, Viral , Genomics , Humans , Molecular Epidemiology/methods , Molecular Sequence Annotation , Mutation
4.
Viruses ; 13(12), 2021 12 03.
Article in English | MEDLINE | ID: covidwho-1554806

ABSTRACT

SARS-CoV-2 genomic sequencing efforts have scaled dramatically to address the current global pandemic and aid public health. However, autonomous genome annotation of SARS-CoV-2 genes, proteins, and domains is not readily accomplished by existing methods and results in missing or incorrect sequences. To overcome this limitation, we developed a novel semi-supervised pipeline for automated gene, protein, and functional domain annotation of SARS-CoV-2 genomes that differentiates itself by not relying on the use of a single reference genome and by overcoming atypical genomic traits that challenge traditional bioinformatic methods. We analyzed an initial corpus of 66,000 SARS-CoV-2 genome sequences collected from labs across the world using our method and identified the comprehensive set of known proteins with 98.5% set membership accuracy and 99.1% accuracy in length prediction, compared to proteome references, including Replicase polyprotein 1ab (with its transcriptional slippage site). Compared to other published tools, such as Prokka (base) and VAPiD, our method yielded a 6.4- and 1.8-fold increase in protein annotations, respectively. Our method generated 13,000,000 gene, protein, and domain sequences, some conserved across time and geography and others representing emerging variants. We observed 3362 non-redundant sequences per protein on average within this corpus and described key D614G and N501Y variants spatiotemporally in the initial genome corpus. For spike glycoprotein domains, we achieved greater than 97.9% sequence identity to references and characterized receptor binding domain variants. We further demonstrated the robustness and extensibility of our method on an additional 4000 variant-diverse genomes containing all named variants of concern and interest as of August 2021. In this cohort, we successfully identified all keystone spike glycoprotein mutations in our predicted protein sequences with greater than 99% accuracy and demonstrated high accuracy of the protein and domain annotations. This work comprehensively presents the molecular targets to refine biomedical interventions for SARS-CoV-2 with a scalable, high-accuracy method to analyze newly sequenced infections as they arise.


Subject(s)
COVID-19/virology , Genome, Viral , Molecular Sequence Annotation , SARS-CoV-2/genetics , Amino Acid Sequence , Base Sequence , Computational Biology , Humans , Mutation , Protein Binding , Protein Domains , Spike Glycoprotein, Coronavirus/genetics
5.
Nucleic Acids Res ; 50(D1): D632-D639, 2022 01 07.
Article in English | MEDLINE | ID: covidwho-1506219

ABSTRACT

Network medicine has proven useful for dissecting genetic organization of complex human diseases. We have previously published HumanNet, an integrated network of human genes for disease studies. Since the release of the last version of HumanNet, many large-scale protein-protein interaction datasets have accumulated in public depositories. Additionally, the numbers of research papers and functional annotations for gene-phenotype associations have increased significantly. Therefore, updating HumanNet is a timely task for further improvement of network-based research into diseases. Here, we present HumanNet v3 (https://www.inetbio.org/humannet/, covering 99.8% of human protein coding genes) constructed by means of the expanded data with improved network inference algorithms. HumanNet v3 supports a three-tier model: HumanNet-PI (a protein-protein physical interaction network), HumanNet-FN (a functional gene network), and HumanNet-XC (a functional network extended by co-citation). Users can select a suitable tier of HumanNet for their study purpose. We showed that on disease gene predictions, HumanNet v3 outperforms both the previous HumanNet version and other integrated human gene networks. Furthermore, we demonstrated that HumanNet provides a feasible approach for selecting host genes likely to be associated with COVID-19.


Subject(s)
Algorithms , COVID-19/genetics , Communicable Diseases/genetics , Databases, Genetic , Gene Regulatory Networks , Software , COVID-19/virology , Communicable Diseases/classification , Gene Ontology , Humans , Internet , Molecular Sequence Annotation , Protein Interaction Mapping , SARS-CoV-2/pathogenicity
6.
Nucleic Acids Res ; 50(D1): D765-D770, 2022 01 07.
Article in English | MEDLINE | ID: covidwho-1462428

ABSTRACT

The COVID-19 pandemic has seen unprecedented use of SARS-CoV-2 genome sequencing for epidemiological tracking and identification of emerging variants. Understanding the potential impact of these variants on the infectivity of the virus and the efficacy of emerging therapeutics and vaccines has become a cornerstone of the fight against the disease. To support the maximal use of genomic information for SARS-CoV-2 research, we launched the Ensembl COVID-19 browser, making SARS-CoV-2 the first virus to be encompassed within the Ensembl platform. This resource incorporates a new Ensembl gene set, multiple variant sets, and annotation from several relevant resources aligned to the reference SARS-CoV-2 assembly. Since the first release in May 2020, the content has been regularly updated using our new rapid release workflow, and tools such as the Ensembl Variant Effect Predictor have been integrated. The Ensembl COVID-19 browser is freely available at https://covid-19.ensembl.org.


Subject(s)
COVID-19/virology , Databases, Genetic , SARS-CoV-2/genetics , Web Browser , Coronaviridae/genetics , Genetic Variation , Genome, Viral , Humans , Molecular Sequence Annotation
7.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1387963

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
COVID-19/prevention & control , Computational Biology/methods , Databases, Genetic , Genomics/methods , Molecular Sequence Annotation/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Epidemics , Humans , Internet , Mice , Pseudogenes/genetics , RNA, Long Noncoding/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Transcription, Genetic/genetics
8.
Nucleic Acids Res ; 49(D1): D266-D273, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1387962

ABSTRACT

CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains, and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web-pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH including SCOP, InterPro, Aquaria and 2DProt.


Subject(s)
Computational Biology/statistics & numerical data , Databases, Protein/statistics & numerical data , Protein Domains , Proteins/chemistry , Amino Acid Sequence , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , Computational Biology/methods , Epidemics , Humans , Internet , Molecular Sequence Annotation , Proteins/genetics , Proteins/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Sequence Analysis, Protein/methods , Sequence Homology, Amino Acid , Viral Proteins/chemistry , Viral Proteins/genetics , Viral Proteins/metabolism
9.
Nucleic Acids Res ; 49(D1): D92-D96, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1387961

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 9.9 trillion base pairs from over 2.1 billion nucleotide sequences for 478 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. Recent updates include new resources for data from the SARS-CoV-2 virus, updates to the NCBI Submission Portal and associated submission wizards for dengue and SARS-CoV-2 viruses, new taxonomy queries for viruses and prokaryotes, and simplified submission processes for EST and GSS sequences.


Subject(s)
Computational Biology/statistics & numerical data , Databases, Nucleic Acid , Genomics/methods , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , Animals , COVID-19/epidemiology , COVID-19/virology , Computational Biology/methods , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Annotation/methods , Pandemics
10.
Brief Bioinform ; 22(2): 1267-1278, 2021 03 22.
Article in English | MEDLINE | ID: covidwho-1343631

ABSTRACT

Accessory proteins play important roles in the interaction between coronaviruses and their hosts. Accordingly, a comprehensive study of the compositional diversity and evolutionary patterns of accessory proteins is critical to understanding the host adaptation and epidemic variation of coronaviruses. Here, we developed a standardized genome annotation tool for coronavirus (CoroAnnoter) by combining open reading frame prediction, transcription regulatory sequence recognition and homologous alignment. Using CoroAnnoter, we annotated 39 representative coronavirus strains to form a compositional profile for all of the accessory proteins. The number of accessory proteins varied widely, from 1 to 10, across coronaviruses, with SARS-CoV-2 and SARS-CoV having the most (9 and 10, respectively). The variation between SARS-CoV and SARS-CoV-2 accessory proteins could be traced back to related coronaviruses in other hosts. The genomic distribution of accessory proteins had significant intra-genus conservation and inter-genus diversity and could be grouped into 1, 4, 2 and 1 types for alpha-, beta-, gamma-, and delta-coronaviruses, respectively. Evolutionary analysis suggested that accessory proteins are more conserved when located before the N-terminus of proteins E and M (E-M), while they are more diverse after these proteins. Furthermore, comparison of virus-host interaction networks of SARS-CoV-2 and SARS-CoV accessory proteins showed that they share multiple antiviral signaling pathways, including those involved in the apoptotic process, viral life cycle and response to oxidative stress. In summary, our study provides a tool for coronavirus genome annotation and builds a comprehensive profile for coronavirus accessory proteins covering their composition, classification, evolutionary pattern and host interaction.


Subject(s)
Biological Evolution , COVID-19/virology , SARS-CoV-2/metabolism , Viral Proteins/genetics , Viral Proteins/metabolism , Genes, Viral , Humans , Molecular Sequence Annotation , Open Reading Frames , Protein Interaction Maps , SARS-CoV-2/genetics
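
The abstract of entry 10 names open reading frame prediction as one ingredient of CoroAnnoter without describing how it is done. As a rough illustration of the general idea only (not CoroAnnoter's algorithm), the sketch below scans the three forward reading frames of a DNA-alphabet sequence for stretches that begin with ATG and end at the first in-frame stop codon. The function name `find_orfs`, the `min_codons` threshold and the toy sequence are all hypothetical; real coronavirus annotation also has to deal with ribosomal frameshifting and transcription regulatory sequences, which this sketch ignores.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    `start` is the index of the A in ATG, `end` is exclusive and includes
    the stop codon, and `frame` is 0, 1 or 2. Hypothetical helper, written
    for illustration only.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Close the ORF at the first in-frame stop codon.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence (made up): ATG AAA TTT TAG embedded in flanking bases.
toy = "CCATGAAATTTTAGGG"
toy_orfs = find_orfs(toy)
print(toy_orfs)
```

An ORF that is still open when the sequence ends is dropped here; a production tool would more likely report it as partial.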
11.
J Cell Mol Med ; 25(16): 7825-7839, 2021 08.
Article in English | MEDLINE | ID: covidwho-1280337

ABSTRACT

The new coronavirus pandemic started in China in 2019. The intensity of the disease can range from mild to severe, leading to death in many cases. Despite extensive research in this area, the exact molecular nature of the virus is not fully understood; however, according to available evidence, one of the mechanisms of virus pathogenesis is through the function of viral miRNAs. We therefore hypothesized that SARS-CoV-2 pathogenesis may be due to its miRNAs targeting important host genes that are involved in the respiratory system, immune pathways and vitamin D pathways, thus possibly contributing to disease progression and virus survival. Potential miRNA precursors and mature miRNAs were predicted and confirmed based on the virus genome. The next step was to predict and identify their target genes and perform functional enrichment analysis to recognize the biological processes connected with these genes in the three pathways mentioned above through several comprehensive databases. Finally, cis-acting regulatory elements in 5' regulatory regions were analysed, and the analysis of available RNAseq data determined the expression level of genes. We revealed that thirty-nine mature miRNAs could theoretically derive from the SARS-CoV-2 genome. Functional enrichment analysis elucidated three highlighted pathways involved in SARS-CoV-2 pathogenesis: vitamin D, immune system and respiratory system. Our findings highlighted genes' involvement in three crucial molecular pathways and may help develop new therapeutic targets related to SARS-CoV-2.


Subject(s)
COVID-19/immunology , Host-Pathogen Interactions/physiology , MicroRNAs , SARS-CoV-2/genetics , Vitamin D/metabolism , COVID-19/genetics , COVID-19/virology , Gene Expression Regulation , Humans , Immune System/virology , Molecular Sequence Annotation , Promoter Regions, Genetic , RNA, Viral , Respiratory System/virology , SARS-CoV-2/pathogenicity
12.
Nat Genet ; 53(6): 809-816, 2021 06.
Article in English | MEDLINE | ID: covidwho-1223103

ABSTRACT

As the SARS-CoV-2 virus spreads through human populations, the unprecedented accumulation of viral genome sequences is ushering in a new era of 'genomic contact tracing', that is, using viral genomes to trace local transmission dynamics. However, because the viral phylogeny is already so large (and will undoubtedly grow many fold), placing new sequences onto the tree has emerged as a barrier to real-time genomic contact tracing. Here, we resolve this challenge by building an efficient tree-based data structure encoding the inferred evolutionary history of the virus. We demonstrate that our approach greatly improves the speed of phylogenetic placement of new samples and data visualization, making it possible to complete the placements under the constraints of real-time contact tracing. Thus, our method addresses an important need for maintaining a fully updated reference phylogeny. We make these tools available to the research community through the University of California Santa Cruz SARS-CoV-2 Genome Browser to enable rapid cross-referencing of information in new virus sequences with an ever-expanding array of molecular and structural biology data. The methods described here will empower research and genomic contact tracing for SARS-CoV-2 specifically for laboratories worldwide.


Subject(s)
COVID-19/epidemiology , COVID-19/virology , Computational Biology/methods , Phylogeny , SARS-CoV-2/classification , SARS-CoV-2/genetics , Software , Algorithms , Computational Biology/standards , Databases, Genetic , Genome, Viral , Humans , Molecular Sequence Annotation , Mutation , Web Browser
13.
Nucleic Acids Res ; 49(D1): D589-D599, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1117395

ABSTRACT

PAGER-CoV (http://discovery.informatics.uab.edu/PAGER-CoV/) is a new web-based database that can help biomedical researchers interpret coronavirus-related functional genomic study results in the context of curated knowledge of host viral infection, inflammatory response, organ damage, and tissue repair. The new database consists of 11 835 PAGs (Pathways, Annotated gene-lists, or Gene signatures) from 33 public data sources. Through the web user interface, users can search by a query gene or a query term and retrieve significantly matched PAGs with all the curated information. Users can navigate from a PAG of interest to other related PAGs through either shared PAG-to-PAG co-membership relationships or PAG-to-PAG regulatory relationships, totaling 19 996 993. Users can also retrieve enriched PAGs from an input list of COVID-19 functional study result genes, customize the search data sources, and export all results for subsequent offline data analysis. In a case study, we performed a gene set enrichment analysis (GSEA) of a COVID-19 RNA-seq data set from the Gene Expression Omnibus database. Compared with the results using the standard PAGER database, PAGER-CoV allows for more sensitive matching of known immune-related gene signatures. We expect PAGER-CoV to be invaluable for biomedical researchers to find molecular biology mechanisms and tailored therapeutics to treat COVID-19 patients.


Subject(s)
Algorithms , COVID-19/prevention & control , Computational Biology/methods , Coronavirus/genetics , Databases, Genetic , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/virology , Coronavirus/metabolism , Data Curation/methods , Epidemics , Gene Regulatory Networks , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Annotation/methods , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , User-Computer Interface
14.
Mol Cell Biochem ; 476(5): 2203-2217, 2021 May.
Article in English | MEDLINE | ID: covidwho-1074462

ABSTRACT

The novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV2) causes mild to severe respiratory illness. Early symptoms may include fever, dry cough, sore throat, and difficulty in breathing, which may lead to death in severe cases. Compared to previous outbreaks such as SARS-CoV and Middle East Respiratory Syndrome (MERS), the SARS-CoV2 disease (COVID-19) outbreak has been far more distressing because of its high rate of infection, despite a low infection fatality rate (IFR) of 1.4% worldwide. The World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020. In January 2020, the whole genome of SARS-CoV2 was sequenced, which made it easier for researchers to develop diagnostic kits and to carry out drug repurposing to alleviate the pandemic. It is now important to understand why this virus has a high rate of infectivity, and whether any factor at the genome level facilitates its global spread. In this study, we extensively analyzed the whole genomes of different coronaviruses infecting humans and animals in different geographical locations around the world. The main aim of the study is to identify the similarity and the mutational adaptation of coronaviruses from different hosts and geographical locations to SARS-CoV2 and to provide a better strategy for understanding the mutational rate for specific target-based drug design. The study covers every annotation in a comparative manner, including SNPs, repeat analysis with categorization of short-sequence and long-sequence repeats, different UTRs, transcription factors, and the predicted mature peptides with their specific lengths and positions on the genomes. The extensive SNP analysis revealed that Wuhan SARS-CoV2 and Indian SARS-CoV2 differ by only eight SNPs. Collectively, phylogenetic analysis, repeat analysis, and polymorphism analysis revealed genomic conservation within SARS-CoV2 and a few other coronaviruses, with very low mutational chances, and a large distance and many mutations relative to a few other species.


Subject(s)
COVID-19/genetics , Genome, Viral , Middle East Respiratory Syndrome Coronavirus/genetics , Molecular Sequence Annotation , Phylogeny , RNA, Viral/genetics , SARS-CoV-2/genetics , COVID-19/diagnosis , Genome-Wide Association Study , Humans
15.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1048363

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subject(s)
Databases, Protein , Proteins/chemistry , Amino Acid Sequence , COVID-19/metabolism , Internet , Molecular Sequence Annotation , Protein Domains , Protein Interaction Maps , SARS-CoV-2/metabolism , Sequence Alignment
16.
Viruses ; 13(1), 2020 12 30.
Article in English | MEDLINE | ID: covidwho-1004764

ABSTRACT

In 2019, a novel coronavirus, SARS-CoV-2/nCoV-19, emerged in Wuhan, China, and has been responsible for the current COVID-19 pandemic. The evolutionary origins of the virus remain elusive and understanding its complex mutational signatures could guide vaccine design and development. As part of the international "CoronaHack" in April 2020, we employed a collection of contemporary methodologies to compare the genomic sequences of coronaviruses isolated from human (SARS-CoV-2; n = 163), bat (bat-CoV; n = 215) and pangolin (pangolin-CoV; n = 7) available in public repositories. We have also noted the pangolin-CoV isolate MP789 to bear a stronger resemblance to SARS-CoV-2 than other pangolin-CoV. Following de novo gene annotation prediction, analyses of gene-gene similarity network, codon usage bias and variant discovery were undertaken. Strong host-associated divergences were noted in ORF3a, ORF6, ORF7a, ORF8 and S, and in codon usage bias profiles. Last, we have characterised several high impact variants (in-frame insertion/deletion or stop gain) in bat-CoV and pangolin-CoV populations, some of which are found in the same amino acid position and may be highlighting loci of potential functional relevance.


Subject(s)
Biodiversity , COVID-19/virology , Chiroptera/virology , Coronavirus/genetics , Pangolins/virology , SARS-CoV-2/genetics , Animals , Coronavirus/classification , Evolution, Molecular , Gene Regulatory Networks , Genome, Viral , Genomics , Host Specificity , Humans , Molecular Sequence Annotation , Phylogeny , Sequence Alignment
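
Codon usage bias, mentioned in the abstract of entry 16, is usually assessed starting from per-codon frequencies in a coding sequence. The abstract does not say which measure the authors used, so the sketch below shows only that first counting step. The function name `codon_frequencies` and the toy coding sequence are hypothetical.

```python
from collections import Counter


def codon_frequencies(cds):
    """Return the relative frequency of each codon in a coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    Hypothetical helper, written for illustration only.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy coding sequence (made up): ATG GCT GCT GCC TAA.
toy_freqs = codon_frequencies("ATGGCTGCTGCCTAA")
print(toy_freqs)
```

Comparing such profiles between host groups would additionally need normalisation among synonymous codons, which requires a genetic code table and is left out here.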
17.
Zool Res ; 41(6): 705-708, 2020 Nov 18.
Article in English | MEDLINE | ID: covidwho-982981

ABSTRACT

Since the first reported severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in December 2019, coronavirus disease 2019 (COVID-19) has become a global pandemic, spreading to more than 200 countries and regions worldwide. With continued research progress and virus detection, SARS-CoV-2 genomes and sequencing data have been reported and accumulated at an unprecedented rate. To meet the need for fast analysis of these genome sequences, the National Genomics Data Center (NGDC) of the China National Center for Bioinformation (CNCB) has established an online coronavirus analysis platform, which includes de novo assembly, BLAST alignment, genome annotation, variant identification, and variant annotation modules. The online analysis platform can be freely accessed at the 2019 Novel Coronavirus Resource (2019nCoVR) (https://bigd.big.ac.cn/ncov/online/tools).


Subject(s)
Betacoronavirus/genetics , Computational Biology/methods , Coronavirus Infections/diagnosis , Genome, Viral/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Pneumonia, Viral/diagnosis , Animals , Betacoronavirus/classification , Betacoronavirus/physiology , COVID-19 , China , Computational Biology/organization & administration , Coronavirus Infections/virology , Genetic Variation , Humans , Internet , Molecular Sequence Annotation , Pandemics , Pneumonia, Viral/virology , SARS-CoV-2
18.
J Proteome Res ; 19(11): 4553-4566, 2020 11 06.
Article in English | MEDLINE | ID: covidwho-974862

ABSTRACT

While the COVID-19 pandemic is causing important loss of life, knowledge of the effects of the causative SARS-CoV-2 virus on human cells is currently limited. Investigating protein-protein interactions (PPIs) between viral and host proteins can provide a better understanding of the mechanisms exploited by the virus and enable the identification of potential drug targets. We therefore performed an in-depth computational analysis of the interactome of SARS-CoV-2 and human proteins in infected HEK 293 cells published by Gordon et al. (Nature 2020, 583, 459-468) to reveal processes that are potentially affected by the virus and putative protein binding sites. Specifically, we performed a set of network-based functional and sequence motif enrichment analyses on SARS-CoV-2-interacting human proteins and on PPI networks generated by supplementing viral-host PPIs with known interactions. Using a novel implementation of our GoNet algorithm, we identified 329 Gene Ontology terms for which the SARS-CoV-2-interacting human proteins are significantly clustered in PPI networks. Furthermore, we present a novel protein sequence motif discovery approach, LESMoN-Pro, that identified 9 amino acid motifs for which the associated proteins are clustered in PPI networks. Together, these results provide insights into the processes and sequence motifs that are putatively implicated in SARS-CoV-2 infection and could lead to potential therapeutic targets.


Subject(s)
Betacoronavirus , Coronavirus Infections , Host-Pathogen Interactions/genetics , Pandemics , Viral Pneumonia , Protein Interaction Maps , Algorithms , Amino Acid Motifs , Betacoronavirus/chemistry , Betacoronavirus/metabolism , Betacoronavirus/pathogenicity , COVID-19 , Cluster Analysis , Coronavirus Infections/metabolism , Coronavirus Infections/virology , Gene Ontology , HEK293 Cells , Humans , Molecular Sequence Annotation , Viral Pneumonia/metabolism , Viral Pneumonia/virology , Protein Binding , Protein Interaction Maps/genetics , Protein Interaction Maps/physiology , Proteins/chemistry , Proteins/classification , Proteins/genetics , Proteins/metabolism , SARS-CoV-2 , Viral Proteins/chemistry , Viral Proteins/genetics , Viral Proteins/metabolism
19.
Nucleic Acids Res ; 49(D1): D480-D489, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-944363

ABSTRACT

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.


Subject(s)
Computational Biology/methods , Data Curation/methods , Protein Databases , Knowledge Bases , Proteome/metabolism , Proteomics/methods , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , Humans , Internet , Molecular Sequence Annotation/methods , Pandemics , Proteome/genetics , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , User-Computer Interface , Viral Proteins/genetics , Viral Proteins/metabolism
20.
Nucleic Acids Res ; 49(D1): D1046-D1057, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-939577

ABSTRACT

For more than two decades, the UCSC Genome Browser database (https://genome.ucsc.edu) has provided high-quality genomics data visualization and genome annotations to the research community. As the field of genomics grows and more data become available, new modes of display are required to accommodate new technologies. New features released this past year include a Hi-C heatmap display, a phased family trio display for VCF files, and various track visualization improvements. Striving to keep data up-to-date, new updates to gene annotations include GENCODE Genes, NCBI RefSeq Genes, and Ensembl Genes. New data tracks added for human and mouse genomes include the ENCODE registry of candidate cis-regulatory elements, promoters from the Eukaryotic Promoter Database, and NCBI RefSeq Select and Matched Annotation from NCBI and EMBL-EBI (MANE). Within weeks of learning about the outbreak of coronavirus, UCSC released a genome browser, with detailed annotation tracks, for the SARS-CoV-2 RNA reference assembly.


Subject(s)
COVID-19/prevention & control , Computational Biology/methods , Genetic Databases , Genome/genetics , Genomics/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Data Curation/methods , Epidemics , Humans , Internet , Mice , Molecular Sequence Annotation/methods , SARS-CoV-2/physiology , Software